Q-learning and policy iteration algorithms for stochastic shortest path problems

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

LIDS REPORT 2871 1 Q - Learning and Policy Iteration Algorithms for Stochastic Shortest Path Problems ∗

We consider the stochastic shortest path problem, a classical finite-state Markovian decision problem with a termination state, and we propose new convergent Q-learning algorithms that combine elements of policy iteration and classical Q-learning/value iteration. These algorithms are related to the ones introduced by the authors for discounted problems in [BY10b]. The main difference from the s...

متن کامل

Q-learning and policy iteration algorithms for stochastic shortest path problems

متن کامل

Policy Iteration Algorithm for Shortest Path Problems

Abstract. The shortest paths tree problem consists in finding a spanning tree rooted at a given node, in a directed weighted graph, such that for each node i , the path of the tree which goes from i to the root has minimal weight. We propose an algorithm which is a deterministic version of Howard’s policy iteration scheme. We show that policy iteration is faster than the Bellman (or value itera...

متن کامل

ALGORITHMS FOR BIOBJECTIVE SHORTEST PATH PROBLEMS IN FUZZY NETWORKS

We consider biobjective shortest path problems in networks with fuzzy arc lengths. Considering the available studies for single objective shortest path problems in fuzzy networks, using a distance function for comparison of fuzzy numbers, we propose three approaches for solving the biobjective prob- lems. The rst and second approaches are extensions of the labeling method to solve the sing...

متن کامل

Stochastic Shortest Path Games and Q-Learning

We consider a class of two-player zero-sum stochastic games with finite state and compact control spaces, which we call stochastic shortest path (SSP) games. They are total cost stochastic dynamic games that have a cost-free termination state. Based on their close connection to singleplayer SSP problems, we introduce model conditions that characterize a general subclass of these games that have...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Annals of Operations Research

سال: 2012

ISSN: 0254-5330,1572-9338

DOI: 10.1007/s10479-012-1128-z